Building Fast Performance Models for Loop-Free 64-bit x86 Code Sequences
Similar resources
Performance Characterization of the 64-bit x86 Architecture from Compiler Optimizations' Perspective
Intel Extended Memory 64 Technology (EM64T) and AMD 64-bit architecture (AMD64) are emerging 64-bit x86 architectures that are fully x86 compatible. Compared with the 32-bit x86 architecture, the 64-bit x86 architectures offer applications some new features. For instance, applications can address 64 bits of virtual memory space, perform operations on 64-bit-wide operands, get access to 16 ge...
Synchronization for fast and reentrant operating system kernel tracing
To effectively trace an operating system, a performance monitoring and debugging infrastructure needs the ability to trace various execution contexts. These contexts range from kernel running as a thread to NMI (Non-Maskable Interrupt) contexts. Given that any part of kernel infrastructure used by a kernel tracer could lead to infinite recursion if traced, and because most kernel primitives req...
ROP Compiler Jeff Stewart, Veer
When developing exploits for modern x86 64-bit systems, attackers must handcraft exploits for each binary. This involves finding a vulnerability (such as a stack-based buffer overflow) and diverting control flow (for example, by overwriting the return address). Modern exploits employ Return-Oriented Programming (ROP) to bypass widely deployed defenses such as W^X. Building a ROP chain requires manual effort to find ...
Lightweight Memory Tracing
Memory tracing (executing additional code for every memory access of a program) is a powerful technique with many applications, e.g., debugging, taint checking, or tracking dataflow. Current approaches are limited: software-only memory tracing incurs high performance overhead (e.g., for Libdft up to 10x) because every single memory access of the application is checked by additional code that is...
Register Pressure Guided Unroll-and-Jam
Unroll-and-jam is an effective loop optimization that not only improves cache locality and instruction level parallelism (ILP) but also benefits other loop optimizations such as scalar replacement. However, unroll-and-jam increases register pressure, potentially resulting in performance degradation when the increase in register pressure causes register spilling. In this paper, we present a low ...
Publication date: 2013